Speech production based on lossy tube models: unit concatenation and sound transitions

Authors

  • Karl Schnell
  • Arild Lacroix
Abstract

The discrete-time tube model is well established in speech analysis and synthesis, where it provides a simplified model of the vocal tract. The standard lossless tube model is extended by introducing frequency-dependent losses. This contribution shows how the lossy vocal tract model can be used for speech production. For that purpose, the vocal tract areas of the model can be estimated from speech signals by an optimization algorithm. The estimated model parameters can be used successfully for resynthesis. Furthermore, speech units are concatenated by transitions of the vocal tract areas. For transitions from one sound to another, a nonlinear area transition is proposed, which improves the transition compared with a purely linear transition in both lossy and lossless models. The investigations show that the lossy tube model is advantageous compared with the standard lossless tube model for speech analysis and speech production.
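
As a rough illustration of what a transition of vocal tract areas between two sounds involves, the sketch below interpolates between two area functions and converts each intermediate area function into junction reflection coefficients of a lossless tube model. Everything specific here is an assumption made for illustration: the function names, the area values, and in particular the log-area interpolation, which only stands in for a nonlinear transition because the abstract does not state which form the authors use. The frequency-dependent losses of the lossy model are not modeled.

```python
import math


def reflection_coefficients(areas):
    """Junction reflection coefficients of a lossless tube model.

    Convention used here: r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k).
    Some texts use the opposite sign.
    """
    return [(b - a) / (b + a) for a, b in zip(areas, areas[1:])]


def linear_transition(a_from, a_to, t):
    """Linear interpolation of the areas, with t running from 0 to 1."""
    return [(1.0 - t) * x + t * y for x, y in zip(a_from, a_to)]


def log_area_transition(a_from, a_to, t):
    """Interpolation in the log-area domain (geometric interpolation).

    Assumption: an illustrative nonlinear transition only, not the
    transition proposed in the paper.
    """
    return [
        math.exp((1.0 - t) * math.log(x) + t * math.log(y))
        for x, y in zip(a_from, a_to)
    ]


# Hypothetical area functions in cm^2, ordered from glottis to lips.
areas_a = [0.5, 1.0, 4.0, 6.0]
areas_b = [2.0, 4.0, 1.0, 0.5]

if __name__ == "__main__":
    steps = 4
    for step in range(steps + 1):
        t = step / steps
        lin = linear_transition(areas_a, areas_b, t)
        log = log_area_transition(areas_a, areas_b, t)
        print(f"t={t:.2f}")
        print("  linear  :", [round(r, 3) for r in reflection_coefficients(lin)])
        print("  log-area:", [round(r, 3) for r in reflection_coefficients(log)])
```

Both transitions start and end at the same area functions; they differ only in the intermediate shapes, which is where a choice of transition law affects the synthesized sound.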

Similar resources

Analysis of lossy vocal tract models for speech production

Discrete time tube models describe the propagation of plane sound waves through the vocal tract. Therefore they are important for speech analysis and production. In most cases discrete time models without losses have been used. In this contribution loss effects are introduced by extended uniform tube elements modeling frequency dependent losses. The parameters of these extended tube elements ca...


Text-to-Speech Synthesis using Phoneme Concatenation

We propose a Text-to-Speech (TTS) synthesis system based on phonetic concatenation for unrestricted input text. The input text is first converted into a phonetic transcription using Letter-to-Sound rules. To synthesize new speech, the TTS system selects the recorded phoneme units (PUs) from a database and modifies the duration according to the rule based on spelling using Time Domain Pitch Synchron...


Model based analysis of a diphone database for improved unit concatenation

One crucial point of concatenation approaches using diphones is to handle the discontinuities between the concatenated units. This problem is treated by a suitable analysis of the diphones for a parametric synthesis. The model of the parametric synthesis is the lossy tube model, which is an extension of the standard lattice filter considering frequency dependent vocal tract losses. The paramete...


Modeling Co-articulation in Text-to-Audio Visual Speech

This paper presents our approach to co-articulation for a text-to-audiovisual speech synthesizer (TTAVS), a system for converting input text into a video-realistic audio-visual sequence. It is an image-based system that models the face using a set of images of a human subject. A concatenation of visemes (the lip shapes corresponding to phonemes) can be used for modeling visual speech. However, in...


Joint analysis of speech frames for synthesis based on lossy tube models

This paper discusses a model-based synthesis approach focused on the estimation of model parameters. For the treated approach, tube models are used for analysis and synthesis of speech units. In comparison to the standard lossless tube model, an extended tube model is used which includes the frequency dependent vocal tract losses. The parameters of the tube models are estimated by minimizing th...




Publication date: 2004